Skip to content

Conversation

@nikhilsuri-db
Copy link
Contributor

@nikhilsuri-db nikhilsuri-db commented Nov 4, 2025

What type of PR is this?

  • Refactor
  • Feature
  • Bug Fix
  • Other

Description

This PR introduces a circuit breaker pattern to the telemetry system to prevent cascading failures and improve system resilience. The implementation uses the pybreaker library to monitor telemetry request failures and automatically open the circuit when failure rates exceed configurable thresholds, blocking further requests to protect downstream services. When the circuit is open, telemetry requests are temporarily blocked until the system recovers, at which point the circuit transitions to half-open for testing and eventually closes when normal operation resumes. The circuit breaker configuration is centralized with immutable settings, includes comprehensive logging for monitoring, and maintains full backward compatibility with existing telemetry functionality.

How is this tested?

  • Unit tests
  • E2E Tests
  • Manually
  • N/A

Related Tickets & Documents

https://docs.google.com/document/d/1ftRvby9bwDZzE3s1tOb4hJ4Pd9USiXskb9cDw-uQNPM/edit?usp=sharing

@github-actions
Copy link

github-actions bot commented Nov 4, 2025

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@github-actions
Copy link

github-actions bot commented Nov 4, 2025

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

)


class CircuitBreakerStateListener(CircuitBreakerListener):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only used for logging purposed for now

logger = logging.getLogger(__name__)

# Circuit Breaker Configuration Constants
MINIMUM_CALLS = 20
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded values as do not want user to configure them anywhere in the driver

Signed-off-by: Nikhil Suri <[email protected]>
@github-actions
Copy link

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

{ version = ">=18.0.0", python = ">=3.13", optional=true }
]
pyjwt = "^2.0.0"
pybreaker = "^1.0.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vikrantpuppala Can we be sure that adding new library won't break any other client usage?

pool_connections: Optional[int] = None,
pool_maxsize: Optional[int] = None,
user_agent: Optional[str] = None,
telemetry_circuit_breaker_enabled: Optional[bool] = None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove it from here and instead set this using a safe flag. @gopalldb what was the rolloutplan for CB in java?

logger.error("HTTP request failed after retries: %s", e)
raise RequestError(f"HTTP request failed: {e}")

# Try to extract HTTP status code from the MaxRetryError
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting http_code here which will be used by CircuitBreaker to consider regression in telemetry endpoint



@dataclass(frozen=True)
class CircuitBreakerConfig:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single object to store CB config

return breaker

@classmethod
def _create_noop_circuit_breaker(cls) -> CircuitBreaker:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cases where client config setup failed or used by test cases.

Signed-off-by: Nikhil Suri <[email protected]>
Signed-off-by: Nikhil Suri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants